{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Deploying Campaigns and Filters<a class=\"anchor\" id=\"top\"></a>\n",
    "\n",
    "In this notebook, you will deploy and interact with campaigns in Amazon Personalize.\n",
    "\n",
    "1. [Introduction](#intro)\n",
    "1. [Create campaigns](#create)\n",
    "1. [Interact with campaigns](#interact)\n",
    "1. [Batch recommendations](#batch)\n",
    "1. [Wrap up](#wrapup)\n",
    "\n",
    "## Introduction <a class=\"anchor\" id=\"intro\"></a>\n",
    "[Back to top](#top)\n",
    "\n",
    "At this point, you should have several solutions and at least one solution version for each. Once a solution version is created, it is possible to get recommendations from them, and to get a feel for their overall behavior.\n",
    "\n",
    "This notebook starts off by deploying each of the solution versions from the previous notebook into individual campaigns. Once they are active, there are resources for querying the recommendations, and helper functions to digest the output into something more human-readable. \n",
    "\n",
    "As you with your customer on Amazon Personalize, you can modify the helper functions to fit the structure of their data input files to keep the additional rendering working.\n",
    "\n",
    "To get started, once again, we need to import libraries, load values from previous notebooks, and load the SDK."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](img/04.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "from time import sleep\n",
    "import json\n",
    "from datetime import datetime\n",
    "import uuid\n",
    "import random\n",
    "\n",
    "import boto3\n",
    "import botocore\n",
    "from botocore.exceptions import ClientError\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store -r"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "personalize = boto3.client('personalize')\n",
    "personalize_runtime = boto3.client('personalize-runtime')\n",
    "\n",
    "# Establish a connection to Personalize's event streaming\n",
    "personalize_events = boto3.client(service_name='personalize-events')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create campaigns <a class=\"anchor\" id=\"create\"></a>\n",
    "[Back to top](#top)\n",
    "\n",
    "A campaign is a hosted solution version; an endpoint which you can query for recommendations. Pricing is set by estimating throughput capacity (requests from users for personalization per second). When deploying a campaign, you set a minimum throughput per second (TPS) value. This service, like many within AWS, will automatically scale based on demand, but if latency is critical, you may want to provision ahead for larger demand. For this POC and demo, all minimum throughput thresholds are set to 1. For more information, see the [pricing page](https://aws.amazon.com/personalize/pricing/).\n",
    "\n",
    "Let's start deploying the campaigns."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### User Personalization\n",
    "\n",
    "Deploy a campaign for your User Personalization solution version. It can take around 10 minutes to deploy a campaign. Normally, we would use a while loop to poll until the task is completed. However the task would block other cells from executing, and the goal here is to create multiple campaigns. So we will set up the while loop for all of the campaigns further down in the notebook. There, you will also find instructions for viewing the progress in the AWS console."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"campaignArn\": \"arn:aws:personalize:us-east-1:835319576252:campaign/personalize-poc-userpersonalization\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"494e1bed-e18e-4eab-b955-fb33574b6e90\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Sun, 08 Nov 2020 07:49:56 GMT\",\n",
      "      \"x-amzn-requestid\": \"494e1bed-e18e-4eab-b955-fb33574b6e90\",\n",
      "      \"content-length\": \"105\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "userpersonalization_create_campaign_response = personalize.create_campaign(\n",
    "    name = \"personalize-poc-userpersonalization\",\n",
    "    solutionVersionArn = userpersonalization_solution_version_arn,\n",
    "    minProvisionedTPS = 1\n",
    ")\n",
    "\n",
    "userpersonalization_campaign_arn = userpersonalization_create_campaign_response['campaignArn']\n",
    "print(json.dumps(userpersonalization_create_campaign_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SIMS\n",
    "\n",
    "Deploy a campaign for your SIMS solution version. It can take around 10 minutes to deploy a campaign. Normally, we would use a while loop to poll until the task is completed. However the task would block other cells from executing, and the goal here is to create multiple campaigns. So we will set up the while loop for all of the campaigns further down in the notebook. There, you will also find instructions for viewing the progress in the AWS console."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"campaignArn\": \"arn:aws:personalize:us-east-1:835319576252:campaign/personalize-poc-SIMS\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"6e4cb4c4-a5ae-4892-9447-8fce51323048\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Sun, 08 Nov 2020 07:49:55 GMT\",\n",
      "      \"x-amzn-requestid\": \"6e4cb4c4-a5ae-4892-9447-8fce51323048\",\n",
      "      \"content-length\": \"90\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "sims_create_campaign_response = personalize.create_campaign(\n",
    "    name = \"personalize-poc-SIMS\",\n",
    "    solutionVersionArn = sims_solution_version_arn,\n",
    "    minProvisionedTPS = 1\n",
    ")\n",
    "\n",
    "sims_campaign_arn = sims_create_campaign_response['campaignArn']\n",
    "print(json.dumps(sims_create_campaign_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Personalized Ranking\n",
    "\n",
    "Deploy a campaign for your personalized ranking solution version. It can take around 10 minutes to deploy a campaign. Normally, we would use a while loop to poll until the task is completed. However the task would block other cells from executing, and the goal here is to create multiple campaigns. So we will set up the while loop for all of the campaigns further down in the notebook. There, you will also find instructions for viewing the progress in the AWS console."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"campaignArn\": \"arn:aws:personalize:us-east-1:835319576252:campaign/personalize-poc-rerank\",\n",
      "  \"ResponseMetadata\": {\n",
      "    \"RequestId\": \"2a726891-98f0-46f4-b4f6-c36da4c683f2\",\n",
      "    \"HTTPStatusCode\": 200,\n",
      "    \"HTTPHeaders\": {\n",
      "      \"content-type\": \"application/x-amz-json-1.1\",\n",
      "      \"date\": \"Sun, 08 Nov 2020 07:49:56 GMT\",\n",
      "      \"x-amzn-requestid\": \"2a726891-98f0-46f4-b4f6-c36da4c683f2\",\n",
      "      \"content-length\": \"92\",\n",
      "      \"connection\": \"keep-alive\"\n",
      "    },\n",
      "    \"RetryAttempts\": 0\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "rerank_create_campaign_response = personalize.create_campaign(\n",
    "    name = \"personalize-poc-rerank\",\n",
    "    solutionVersionArn = rerank_solution_version_arn,\n",
    "    minProvisionedTPS = 1\n",
    ")\n",
    "\n",
    "rerank_campaign_arn = rerank_create_campaign_response['campaignArn']\n",
    "print(json.dumps(rerank_create_campaign_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### View campaign creation status\n",
    "\n",
    "As promised, how to view the status updates in the console:\n",
    "\n",
    "* In another browser tab you should already have the AWS Console up from opening this notebook instance. \n",
    "* Switch to that tab and search at the top for the service `Personalize`, then go to that service page. \n",
    "* Click `View dataset groups`.\n",
    "* Click the name of your dataset group, most likely something with POC in the name.\n",
    "* Click `Campaigns`.\n",
    "* You will now see a list of all of the campaigns you created above, including a column with the status of the campaign. Once it is `Active`, your campaign is ready to be queried.\n",
    "\n",
    "Or simply run the cell below to keep track of the campaign creation status."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "At least one campaign build is still in progress\n",
      "At least one campaign build is still in progress\n",
      "At least one campaign build is still in progress\n",
      "At least one campaign build is still in progress\n",
      "At least one campaign build is still in progress\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "\n",
    "in_progress_campaigns = [\n",
    "    userpersonalization_campaign_arn,\n",
    "    sims_campaign_arn,\n",
    "    rerank_campaign_arn\n",
    "]\n",
    "\n",
    "max_time = time.time() + 3*60*60 # 3 hours\n",
    "while time.time() < max_time:\n",
    "    for campaign_arn in in_progress_campaigns:\n",
    "        version_response = personalize.describe_campaign(\n",
    "            campaignArn = campaign_arn\n",
    "        )\n",
    "        status = version_response[\"campaign\"][\"status\"]\n",
    "        \n",
    "        if status == \"ACTIVE\":\n",
    "            print(\"Build succeeded for {}\".format(campaign_arn))\n",
    "            in_progress_campaigns.remove(campaign_arn)\n",
    "        elif status == \"CREATE FAILED\":\n",
    "            print(\"Build failed for {}\".format(campaign_arn))\n",
    "            in_progress_campaigns.remove(campaign_arn)\n",
    "    \n",
    "    if len(in_progress_campaigns) <= 0:\n",
    "        break\n",
    "    else:\n",
    "        print(\"At least one campaign build is still in progress\")\n",
    "        \n",
    "    time.sleep(60)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Filters <a class=\"anchor\" id=\"interact\"></a>\n",
    "[Back to top](#top)\n",
    "\n",
    "Now that all campaigns are deployed and active, we can create filters. Filters can be created for both Items and Events. A few common use cases for filters in Video On Demand are:\n",
    "\n",
    "Categorical filters based on Item Metadata - Often your item metadata will have information about thee title such as Genre, Keyword, Year, Decade etc. Filtering on these can provide recommendations within that data, such as action movies.\n",
    "\n",
    "Events - you may want to filter out certain events and provide results based on those events, such as moving a title from a \"suggestions to watch\" recommendation to a \"watch again\" recommendations.\n",
    "\n",
    "Lets look at the item metadata and user interactions, so we can get an idea what type of filters we can create."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a dataframe for the items by reading in the correct source CSV\n",
    "items_df = pd.read_csv(data_dir + '/item-meta.csv', sep=',', index_col=0)\n",
    "#interactions_df = pd.read_csv(data_dir + '/interactions.csv', sep=',', index_col=0)\n",
    "\n",
    "# Render some sample data\n",
    "items_df.head(10)\n",
    "#interactions_df.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now what we want to do is determine the genres to filter on, for that we need a list of all genres. First we will get all the unique values of the column GENRE, then split strings on `|` if they exist, everyone will then get added to a long list which will be converted to a set for efficiency. That set will then be made into a list so that it can be iterated, and we can then use the create filter API."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "unique_genre_field_values = items_df['GENRE'].unique()\n",
    "\n",
    "genre_val_list = []\n",
    "\n",
    "def process_for_bar_char(val, val_list):\n",
    "    if '|' in val:\n",
    "        values = val.split('|')\n",
    "        for item in values:\n",
    "            val_list.append(item)\n",
    "    elif '(' in val:\n",
    "        pass\n",
    "    else:\n",
    "        val_list.append(val)\n",
    "    return val_list\n",
    "    \n",
    "\n",
    "for val in unique_genre_field_values:\n",
    "    genre_val_list = process_for_bar_char(val, genre_val_list)\n",
    "\n",
    "genres_to_filter = list(set(genre_val_list))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "genres_to_filter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With this we now have all of the genres that exist in our dataset. A soft limit of Personalize at this time is 10 total filters, given we have a larger number of genres, we will select 7 at random to leave room for 2 interaction based filters later and an additional filter for year based recommendations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "genres_to_filter = random.sample(genres_to_filter, 7)\n",
    "genres_to_filter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now create a list for the metadata genre filters and then create the actual filters with the cells below. Note this will take a few minutes to complete."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a list for the filters:\n",
    "meta_filter_arns = []"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "# Iterate through Genres\n",
    "for genre in genres_to_filter:\n",
    "    # Start by creating a filter\n",
    "    try:\n",
    "        createfilter_response = personalize.create_filter(\n",
    "            name=genre,\n",
    "            datasetGroupArn=dataset_group_arn,\n",
    "            filterExpression='INCLUDE ItemID WHERE Items.GENRE IN (\"'+ genre +'\")'\n",
    "        )\n",
    "        # Add the ARN to the list\n",
    "        meta_filter_arns.append(createfilter_response['filterArn'])\n",
    "        print(\"Creating: \" + createfilter_response['filterArn'])\n",
    "    \n",
    "    # If this fails, wait a bit\n",
    "    except ClientError as error:\n",
    "        # Here we only care about raising if it isnt the throttling issue\n",
    "        if error.response['Error']['Code'] != 'LimitExceededException':\n",
    "            print(error)\n",
    "        else:    \n",
    "            time.sleep(120)\n",
    "            createfilter_response = personalize.create_filter(\n",
    "                name=genre,\n",
    "                datasetGroupArn=dataset_group_arn,\n",
    "                filterExpression='INCLUDE ItemID WHERE Items.GENRE IN (\"'+ genre +'\")'\n",
    "            )\n",
    "            # Add the ARN to the list\n",
    "            meta_filter_arns.append(createfilter_response['filterArn'])\n",
    "            print(\"Creating: \" + createfilter_response['filterArn'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets also create 2 event filters for watched and unwatched content"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a dataframe for the interactions by reading in the correct source CSV\n",
    "interactions_df = pd.read_csv(data_dir + '/interactions.csv', sep=',', index_col=0)\n",
    "\n",
    "# Render some sample data\n",
    "interactions_df.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets also create 2 event filters for watched and unwatched content"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "createwatchedfilter_response = personalize.create_filter(name='watched',\n",
    "    datasetGroupArn=dataset_group_arn,\n",
    "    filterExpression='INCLUDE ItemID WHERE Interactions.event_type IN (\"watch\")'\n",
    "    )\n",
    "\n",
    "createunwatchedfilter_response = personalize.create_filter(name='unwatched',\n",
    "    datasetGroupArn=dataset_group_arn,\n",
    "    filterExpression='EXCLUDE ItemID WHERE Interactions.event_type IN (\"watch\")'\n",
    "    )\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally since we now have the year available in our item metadata, lets create a decade filter to recommend only moviees releaseed in a given decade, for this workshop we will choosee 1970s cinema. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "createdecadefilter_response = personalize.create_filter(name='1970s',\n",
    "    datasetGroupArn=dataset_group_arn,\n",
    "    filterExpression='INCLUDE ItemID WHERE Items.YEAR >= 1970 AND Items.YEAR < 1980'\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before we are done we will want to add those filters to a list as well so they can be used later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "interaction_filter_arns = [createwatchedfilter_response['filterArn'], createunwatchedfilter_response['filterArn']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "decade_filter_arns = [createdecadefilter_response['filterArn']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store sims_campaign_arn\n",
    "%store userpersonalization_campaign_arn\n",
    "%store rerank_campaign_arn\n",
    "%store meta_filter_arns\n",
    "%store interaction_filter_arns\n",
    "%store decade_filter_arns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You're all set to move on to the last exploratory notebook: `05_Interacting_with_Campaigns_and_Filters.ipynb`. Open it from the browser and you can start interacting with the Campaigns and gettign recommendations!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Release Resources"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%javascript\n",
    "Jupyter.notebook.save_checkpoint();\n",
    "Jupyter.notebook.session.delete();"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
